Use of electronic health records to characterize patients with uncontrolled hypertension in two large health system networks

Background Improving hypertension control is a public health priority. However, consistent identification of uncontrolled hypertension using computable definitions in electronic health records (EHR) across health systems remains uncertain. Methods In this retrospective cohort study, we applied two computable definitions to the EHR data to identify patients with controlled and uncontrolled hypertension and to evaluate differences in characteristics, treatment, and clinical outcomes between these patient populations. We included adult patients (≥ 18 years) with hypertension (based on either ICD-10 codes of hypertension or two elevated blood pressure [BP] measurements) receiving ambulatory care within Yale-New Haven Health System (YNHHS; a large US health system) and OneFlorida Clinical Research Consortium (OneFlorida; a Clinical Research Network comprised of 16 health systems) between October 2015 and December 2018. We identified patients with controlled and uncontrolled hypertension based on either a single BP measurement from a randomly selected visit or all BP measurements recorded between hypertension identification and the randomly selected visit. Results Overall, 253,207 and 182,827 adults at YNHHS and OneFlorida were identified as having hypertension. Of these patients, 83.1% at YNHHS and 76.8% at OneFlorida were identified using ICD-10-CM codes, whereas 16.9% and 23.2%, respectively, were identified using elevated BP measurements (≥ 140/90 mmHg). A total of 24.1% of patients at YNHHS and 21.6% at OneFlorida had both a diagnosis code for hypertension and elevated blood pressure measurements. Uncontrolled hypertension was observed among 32.5% and 43.7% of patients at YNHHS and OneFlorida, respectively. Uncontrolled hypertension was disproportionately higher among Black patients when compared with White patients (38.9% versus 31.5% in YNHHS; p < 0.001; 49.7% versus 41.2% in OneFlorida; p < 0.001).
Medication prescription for hypertension management was more common in patients with uncontrolled hypertension when compared with those with controlled hypertension (overall treatment rate: 39.3% versus 37.3% in YNHHS; p = 0.04; 42.2% versus 34.8% in OneFlorida; p < 0.001). Patients with controlled and uncontrolled hypertension had similar incidence rates of deaths, CVD events, and healthcare visits at 3, 6, 12, and 24 months. The two computable definitions generated consistent results. Conclusions While the current EHR systems are not fully optimized for disease surveillance and stratification, our findings illustrate the potential of leveraging EHR data to conduct digital population surveillance in the realm of hypertension management. Supplementary Information The online version contains supplementary material available at 10.1186/s12872-024-04161-x.


Introduction
Improving hypertension control is a public health priority in the US [1]. Approximately half of US adults have hypertension, but fewer than half have their blood pressure (BP) controlled (< 140/90 mmHg) [2]. Uncontrolled BP can result from a variety of factors, including nonadherence to medication regimens due to side effects, cost, or lack of understanding, as well as lifestyle factors like poor diet, high sodium intake, lack of physical activity, and excessive alcohol consumption [3]. Additionally, underlying medical conditions such as obesity, diabetes, and chronic kidney disease, along with inadequate management or lack of timely medical follow-up, can make BP more difficult to control [4]. Uncontrolled BP increases the risk of severe health issues such as stroke, heart attack, kidney disease, heart failure, and cognitive decline [5]. Understanding individuals with uncontrolled hypertension, their treatments, and outcomes is essential for public health and healthcare system interventions. Electronic health record (EHR) data offer a unique opportunity to study uncontrolled hypertension due to their access to extensive, long-term clinical information compared to other sources [6,7].
However, consistent identification of uncontrolled hypertension through EHRs across health systems remains challenging. There is no specific code for uncontrolled hypertension, making diagnosis reliant on numerous observations over time. Utilizing computable definitions that incorporate various EHR data elements to identify patients with the condition can be beneficial and has the potential for large-scale digital population health surveillance [8][9][10]. In particular, computable definitions can enhance current practices in identifying uncontrolled hypertension by providing standardized criteria that can be consistently applied across various health systems. These definitions enable the automatic extraction and analysis of relevant data from EHRs, reducing the reliance on manual chart reviews. By incorporating real-world data on BP readings and other relevant factors, computable definitions facilitate more timely and accurate identification of patients with uncontrolled hypertension, ultimately leading to better-targeted interventions and improved patient outcomes. Despite clinical guidelines providing a basic definition of uncontrolled hypertension [5,11], few studies have created computable definitions based on structured diagnosis codes, vital signs, and common data models for clinical research and practice. Additionally, EHR data can be configured differently in terms of frequency, context, and time, making it unclear how different definitions affect patient identification.
Accordingly, the objective of this study was to develop and apply two computable definitions (i.e., algorithmic criteria based on EHR data) to consistently identify patients with controlled and uncontrolled hypertension using EHR data from two large health system networks: Yale New Haven Health System, serving Connecticut and parts of Rhode Island and New York, and OneFlorida, serving the state of Florida. We also aimed to compare characteristics, treatment patterns, and clinical outcomes of patients with controlled and uncontrolled hypertension.

Project origination
The National Evaluation System for health Technology Coordinating Center (NESTcc) is an organization established through grant funding to the Medical Device Innovation Consortium by the US Food and Drug Administration in 2016 to promote the development of robust real-world evidence for regulatory decision-making [12]. NESTcc currently includes 19 Network Collaborators (health care providers, academic research institutions, payers, and professional registries) that collect, curate, and analyze real-world evidence that may be used for regulatory decision-making.
This study was proposed to NESTcc by Medtronic Inc, which is currently studying its Symplicity™ Renal Denervation System in patients with hypertension in a series of sham-controlled and real-world studies intended to support a premarket approval application in the USA [13,14]. After an independent review of the study concept and subsequent proposal, NESTcc funded the project. Among its Network Collaborators, NESTcc identified a large health system and a clinical research network interested in pursuing the proposed project, each of which had extensive experience with EHR data analysis: Yale-New Haven Health System (YNHHS) and the OneFlorida Clinical Research Consortium (OneFlorida). Medtronic and the two NESTcc Network Collaborators, with YNHHS serving as the lead, developed a full research plan that was approved by NESTcc. Institutional Review Board approval was obtained at Yale University and the University of Florida. The study followed the guidelines for cohort studies described in the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement.

Data sources
The data sources for this study consisted of EHR data from YNHHS and OneFlorida. YNHHS is a large academic health system consisting of five distinct hospital delivery networks and associated ambulatory clinics located in Connecticut and Rhode Island. The system provides services for approximately two million patients annually. OneFlorida is a statewide clinical research network including 16 partner health systems providing services for 40% of Florida's population.
Both YNHHS and OneFlorida conformed data to the National Patient-Centered Clinical Research Network (PCORnet) common data model via extract/transform/load software [15,16], ensuring data elements were standardized and consistent across the two sites. Both sites conducted data quality assessments in a standardized fashion. Data quality was assessed by performing domain value validation checks periodically, assessing for data relevance, reliability, and robustness. Cross-validation was performed on the various data sources to assess for any data gaps and to ensure data completeness. In addition to internal quality checks at each site, the Yale team and the OneFlorida team met regularly to resolve issues regarding the validity and robustness of the results. For this analysis, we used a versioned extract of the PCORnet common data model from October 1, 2015, when International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) diagnosis coding was introduced, through December 31, 2018.

Study population
The study population included adult patients (≥ 18 years) who met the clinical criteria of hypertension between October 1, 2015 and December 31, 2018 if (1) they had an ICD-10-CM diagnosis code for hypertension (I10, I11, I12, I13, I15, I16) associated with at least one ambulatory visit, or (2) in the absence of a diagnosis, they had at least two elevated BP measurements (systolic BP [SBP] ≥ 140 mmHg or diastolic BP [DBP] ≥ 90 mmHg) recorded in the EHR at two separate ambulatory visits occurring at least one day apart within a 6-month period at any time between October 1, 2015 and December 31, 2018. The date of the hypertension diagnosis or the date of the second elevated BP measurement was considered as the date when patients met the criteria of hypertension. Numerous studies in the literature have supported the validity of using these methods for identifying patients with hypertension (with a median area under the receiver operating characteristic curve of 0.95) [17,18]. We used BP ≥ 140/90 mmHg as the cutoff for hypertension because this was the definition of hypertension at the time from which most of the data were extracted [11]. We did not include medication use to define hypertension because drugs like beta-blockers can be prescribed for other conditions, such as arrhythmias or heart failure. By focusing on BP measurements and diagnosis codes, we aimed to identify hypertensive patients irrespective of their medication use.
We excluded patients with fewer than 3 months of follow-up time, female patients with diagnostic or procedural evidence of pregnancy (ICD-10-CM [Z33, Z34, O80, O82, O00, O01, O02, O03, O04, O07, O08]), and patients receiving dialysis (ICD-10-CM [Z99.2]). We also included only those BP measurements recorded at ambulatory visits, excluding BP measurements from inpatient and emergency department (ED) encounters because BP measurements in those encounters could be elevated due to acute conditions. For any visit with multiple BP measurements recorded, the lowest SBP measurement and lowest DBP measurement were used to ascertain hypertension status to minimize the potential for false positives in diagnosing hypertension. We required the visits to be in-person visits where BPs were measured and recorded. Virtual visits were excluded from the analysis. We extended our observation period until the end of 2019 to ensure at least 12-month follow-up for patients.
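The identification criteria above can be illustrated with a minimal Python sketch. It is not the study's code: the data layout, the function names, the 183-day approximation of the 6-month period, and the choice to return whichever criterion is met first are all assumptions made for illustration.

```python
from datetime import date, timedelta

HTN_CODE_PREFIXES = ("I10", "I11", "I12", "I13", "I15", "I16")
SBP_CUTOFF, DBP_CUTOFF = 140, 90
WINDOW = timedelta(days=183)  # assumption: "6-month period" approximated as 183 days


def visit_is_elevated(readings):
    """readings: list of (sbp, dbp) pairs taken at one ambulatory visit.

    The lowest SBP and the lowest DBP of the visit are used, as in the text.
    """
    lowest_sbp = min(s for s, _ in readings)
    lowest_dbp = min(d for _, d in readings)
    return lowest_sbp >= SBP_CUTOFF or lowest_dbp >= DBP_CUTOFF


def hypertension_date(diagnoses, visits):
    """Return the earliest date on which either criterion is met, or None.

    diagnoses: list of (date, icd10_code) from ambulatory visits.
    visits: dict mapping ambulatory visit date -> list of (sbp, dbp) readings.
    """
    dx_dates = [d for d, code in diagnoses if code.startswith(HTN_CODE_PREFIXES)]
    dx_date = min(dx_dates) if dx_dates else None

    # Second of two elevated visits at least one day apart within the window.
    elevated = sorted(d for d, r in visits.items() if visit_is_elevated(r))
    bp_date = None
    for first, second in zip(elevated, elevated[1:]):
        if timedelta(days=1) <= second - first <= WINDOW:
            bp_date = second
            break

    candidates = [d for d in (dx_date, bp_date) if d is not None]
    return min(candidates) if candidates else None
```

Checking only consecutive elevated visits is sufficient here, because any qualifying pair implies that the later visit and its immediate elevated predecessor also fall within the window.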

Definitions of controlled and uncontrolled hypertension
As there are multiple ways in which the EHR data elements are assembled in terms of frequency, clinical context, and time, we tested two different approaches to operationalize the definitions of controlled and uncontrolled hypertension. Specifically, we randomly selected one ambulatory encounter with a BP measurement occurring at least 3 months after patients met the hypertension criteria, within the period from October 1, 2015, to December 31, 2018. This encounter was designated as the index encounter. We then applied two approaches to define controlled/uncontrolled hypertension. Our rationale for selecting a random date was to minimize selection bias. If we had chosen the most recent encounter as the index date, our sample would have been biased toward patients with shorter follow-up times, making it less likely for them to achieve blood pressure control.
Conversely, if we had selected the earliest encounter, our sample would have been biased towards patients with longer follow-up times, offering more opportunity for the patients to achieve blood pressure control (and experience poor clinical outcomes). By randomly selecting a date, we ensured that the follow-up times for our sample would be more balanced overall. In addition, we required patients to have at least 3 months after hypertension identification before being included in the study.
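As an illustration of the index-encounter selection, the following Python sketch draws one eligible encounter uniformly at random. The function names, the 90-day approximation of 3 months, and the uniform draw are assumptions; the source does not specify how the random selection was implemented.

```python
import random
from datetime import date, timedelta

STUDY_START = date(2015, 10, 1)
STUDY_END = date(2018, 12, 31)
MIN_GAP = timedelta(days=90)  # assumption: "3 months" approximated as 90 days


def eligible_encounters(htn_date, encounter_dates):
    """Ambulatory encounters with a BP reading that may serve as the index.

    An encounter qualifies if it falls at least MIN_GAP after the date the
    patient met the hypertension criteria and inside the study period.
    """
    earliest = max(STUDY_START, htn_date + MIN_GAP)
    return sorted(d for d in encounter_dates if earliest <= d <= STUDY_END)


def pick_index_encounter(htn_date, encounter_dates, rng):
    """Pick one eligible encounter uniformly at random, or None if none exist."""
    eligible = eligible_encounters(htn_date, encounter_dates)
    return rng.choice(eligible) if eligible else None
```

Passing a seeded generator such as `random.Random(0)` as `rng` makes the selection reproducible across runs.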
In approach 1, hypertensive patients were considered to have controlled hypertension if more than 50% of their SBP measurements were < 140 mmHg and DBP measurements were < 90 mmHg among the measured BPs on all ambulatory encounters from the date when patients met the hypertension criteria up to and including the index encounter. Hypertensive patients were considered to have uncontrolled hypertension if 50% or more of SBPs were ≥ 140 mmHg or DBPs were ≥ 90 mmHg among the measured BPs on all encounters from the date when patients met the hypertension criteria up to and including the index encounter (Fig. 1). In approach 2, hypertensive patients were considered to have controlled hypertension when both SBP < 140 mmHg and DBP < 90 mmHg at the index encounter. Hypertensive patients were considered to have uncontrolled hypertension when either the SBP was ≥ 140 mmHg or the DBP was ≥ 90 mmHg at the index encounter. Since approach 1 used multiple BP measurements over time, it comprises the primary analysis while approach 2 is the sensitivity analysis.
Fig. 1 Cohort Definitions for Controlled and Uncontrolled Hypertension. Footnote: The red dot on the graph indicates an ambulatory encounter selected randomly at least three months after hypertension identification and between October 1, 2015, and December 31, 2018, serving as the index encounter. We employed two different approaches to determine controlled hypertension among the hypertensive patients. In approach 1, controlled hypertension was defined as having more than 50% of systolic blood pressure (SBP) measurements below 140 mmHg and diastolic blood pressure (DBP) measurements below 90 mmHg across all ambulatory encounters, from the identification date up to and including the index encounter. In approach 2, controlled hypertension was defined as having both SBP < 140 mmHg and DBP < 90 mmHg at the index encounter.
We performed two sensitivity analyses to assess the robustness of our results. In the first analysis, we defined controlled hypertension as having more than 50% of SBP measurements below 130 mmHg and DBP measurements below 80 mmHg among all measured BPs recorded during ambulatory encounters, starting from the date when patients met the hypertension criteria and continuing up to and including the index encounter. This threshold was chosen based on the 2017 American Heart Association/American College of Cardiology hypertension guidelines to assess the robustness of the primary findings [19]. In the second sensitivity analysis, we employed a different threshold. Here, controlled hypertension was defined as having more than 75% of SBP measurements below 140 mmHg and DBP measurements below 90 mmHg among all measured blood pressures recorded during ambulatory encounters, starting from the date when patients met the hypertension criteria and continuing up to and including the index encounter (Supplemental Figure S1).
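The two approaches and the sensitivity thresholds can be expressed compactly. The Python sketch below is illustrative only; it assumes that a reading counts as below target only when both its SBP and DBP are under the cutoffs, which is one reading of the definition above, and the function and parameter names are hypothetical.

```python
def is_below(reading, sbp_cutoff=140, dbp_cutoff=90):
    """True when both SBP and DBP of one (sbp, dbp) reading are under the cutoffs."""
    sbp, dbp = reading
    return sbp < sbp_cutoff and dbp < dbp_cutoff


def approach_1(readings, min_fraction=0.5, sbp_cutoff=140, dbp_cutoff=90):
    """Classify from all readings between hypertension identification and index.

    Controlled when strictly more than `min_fraction` of readings are below
    target; otherwise uncontrolled. `readings` must not be empty.
    """
    below = sum(is_below(r, sbp_cutoff, dbp_cutoff) for r in readings)
    return "controlled" if below / len(readings) > min_fraction else "uncontrolled"


def approach_2(index_reading, sbp_cutoff=140, dbp_cutoff=90):
    """Classify from the single reading at the index encounter."""
    return "controlled" if is_below(index_reading, sbp_cutoff, dbp_cutoff) else "uncontrolled"
```

Under these assumptions, the first sensitivity analysis corresponds to `approach_1(readings, sbp_cutoff=130, dbp_cutoff=80)` and the second to `approach_1(readings, min_fraction=0.75)`.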

Baseline characteristics
Baseline demographic and clinical characteristics of patients included age, race, ethnicity, sex, health insurance type, smoking status, body mass index (BMI), and comorbidities. Race was categorized as Black, White, other(s), and unknown. Ethnicity was categorized as Hispanic, non-Hispanic, and unknown. Comorbidities included heart failure, diabetes mellitus, history of acute myocardial infarction, coronary artery disease, cerebrovascular disease, stroke, atrial fibrillation or flutter, chronic kidney disease, chronic obstructive pulmonary disease, dyslipidemia, peripheral arterial disease, angina, depression, dementia, hypertensive retinopathy, and substance use disorder.
Characteristics using a set time point such as age were defined based on the index encounter. If data for a specific characteristic were not available from the index encounter (e.g., smoking status), we used the most recent data available prior to the index date. Characteristics such as insurance status, which may change across encounters, were defined based on the index encounter. Comorbidities were defined using ICD-10-CM codes based on the 1-year period prior to the index date (see details in Supplemental Table S1).
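A one-year lookback for comorbidities might be sketched as follows in Python. The code-prefix sets shown are hypothetical placeholders rather than the study's lists (which are in Supplemental Table S1), and treating the window as 365 days that includes the index date is an assumption.

```python
from datetime import date, timedelta

LOOKBACK = timedelta(days=365)  # assumption: "1-year period" taken as 365 days

# Hypothetical code-prefix sets for illustration; the study's own lists are
# in its Supplemental Table S1 and are not reproduced here.
CODE_PREFIXES = {
    "diabetes mellitus": ("E10", "E11"),
    "heart failure": ("I50",),
}


def comorbidities_at_index(index_date, diagnoses, code_prefixes=CODE_PREFIXES):
    """Return the comorbidities coded within the lookback window.

    diagnoses: list of (date, icd10_code) pairs for one patient.
    Assumption: the window includes the index date itself.
    """
    window_start = index_date - LOOKBACK
    found = set()
    for dx_date, code in diagnoses:
        if not (window_start <= dx_date <= index_date):
            continue
        for name, prefixes in code_prefixes.items():
            if code.startswith(prefixes):
                found.add(name)
    return found
```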

Classification of antihypertensive medications
To properly classify EHR-based prescription drug data into antihypertensive therapeutic indication and antihypertensive drug classes, we used a previously developed antihypertensive drug classification system based on RxNorm Concept Unique Identifiers (RxCUIs) [20]. We included only oral formulations, with the exception of transdermal clonidine patches. We classified antihypertensive medications into major drug classes, including angiotensin-converting enzyme inhibitors (ACEI), angiotensin receptor blockers (ARB), beta blockers, calcium channel blockers (CCB), thiazide or thiazide-like diuretics, and other antihypertensive drugs. Combination drugs were assigned to each of the classes of their component ingredients. The drug ingredients in each antihypertensive drug class are listed in Supplemental Table S2.
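The classification rules can be illustrated with a small Python sketch. The ingredient-to-class map below is a hypothetical stand-in keyed by ingredient name; the study used an RxCUI-based system [20] whose identifiers are not reproduced here, and the function name and route labels are assumptions.

```python
# Hypothetical ingredient-to-class map for illustration only; the study used
# an RxCUI-based classification system that is not reproduced here.
INGREDIENT_CLASS = {
    "lisinopril": "ACEI",
    "losartan": "ARB",
    "metoprolol": "beta blocker",
    "amlodipine": "CCB",
    "hydrochlorothiazide": "thiazide or thiazide-like diuretic",
    "clonidine": "other",
}


def drug_classes(ingredients, route):
    """Return the set of antihypertensive classes for one prescribed product.

    Only oral products are kept, except transdermal clonidine. A combination
    product contributes one class per component ingredient.
    """
    route = route.lower()
    if route == "oral":
        allowed = ingredients
    elif route == "transdermal":
        allowed = [i for i in ingredients if i.lower() == "clonidine"]
    else:
        allowed = []
    return {INGREDIENT_CLASS[i.lower()] for i in allowed if i.lower() in INGREDIENT_CLASS}
```

For example, an oral lisinopril/hydrochlorothiazide combination is counted in both the ACEI class and the thiazide class, while a product with a non-qualifying route returns an empty set.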

Short-term and long-term outcomes
We examined pre-specified short-term outcomes at 3 and 6 months and long-term outcomes at 12 and 24 months after the index date. Outcomes assessed in this study included clinical outcomes (the composite of death and non-fatal cardiovascular disease [CVD] events) and healthcare utilization (ED visits and hospitalizations for any cause; ambulatory visits for any cause). Non-fatal CVD events were defined as any diagnosis of an acute myocardial infarction (AMI), heart failure, atrial fibrillation/flutter, aortic dissection, renal disease, hemorrhagic stroke, ischemic stroke, or hypertensive crisis at an ED or inpatient visit. Of note, we included only acute event codes, including both primary and secondary diagnosis codes, for outcome ascertainment. We excluded CVD events reported at ambulatory encounters because of the inability to reliably distinguish patients with acute CVD events from those with a history of prior CVD. Death was identified through a combination of reported death records in the EHR, a death diagnosis at any visit, and encounters with a discharge status of expired. The Social Security Death Master File was also used to identify mortality data. ICD-10-CM diagnosis codes for clinical outcomes are listed in Supplemental Table S3 [21]. Because longer follow-up periods are likely required to comprehensively assess the complete range of outcomes associated with hypertension, our examination of long-term outcomes at 24 months was conducted as an exploratory analysis.

Statistical analyses
We first calculated the prevalence of controlled and uncontrolled hypertension among all patients with hypertension. We described the demographic and clinical characteristics of the hypertensive population overall and by controlled vs. uncontrolled status.
We then described the number and class of antihypertensive medications prescribed both in the year prior to the index date and on the index date among overall hypertensive patients and by controlled vs. uncontrolled status. We also described the three most prescribed antihypertensive medications among patients using 1, 2, and 3 or more antihypertensive medications. Finally, we described the frequency and percentage of patient outcomes and healthcare utilization at 3, 6, 12, and 24 months among overall hypertensive patients and by controlled vs. uncontrolled status. For the analysis of patient characteristics, antihypertensive medication prescriptions, and outcomes at 3, 6, and 12 months, we included individuals with a follow-up period of more than three months but less than 24 months. However, we did not include them in the analysis of outcomes at 24 months due to insufficient follow-up data. To mitigate the concern of potential censoring, patients with less than 3 months of follow-up were excluded to ensure sufficient data for robust analysis. For patients with incomplete follow-up data (i.e., those who did not experience the event of interest during the study period), their data were censored at the last recorded visit or appointment in the EHR. Administrative censoring was applied in December 2019, marking the end of the observation period. The incidence rate of cardiovascular events was calculated for patients with controlled and uncontrolled hypertension. If patients had multiple visits during the follow-up period, they were counted only once in the numerator for each time window.
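The once-per-window counting rule can be sketched in Python as below. This is an illustration under stated assumptions, not the study's analysis code: the day lengths for each window, the use of December 31, 2019 as the administrative censoring date, and the rule that the 24-month denominator keeps only patients whose full window ends by that date are all assumptions.

```python
from datetime import date, timedelta

# Assumption: approximate day lengths for the 3-, 6-, 12- and 24-month windows.
WINDOWS_DAYS = {3: 91, 6: 183, 12: 365, 24: 730}
ADMIN_CENSOR = date(2019, 12, 31)  # assumption: "December 2019" taken as month end


def patients_with_event(patients, months):
    """Count patients with at least one event within `months` of the index date.

    patients: list of dicts with 'index_date' and 'event_dates'.
    Returns (numerator, denominator). Each patient contributes at most once
    to the numerator, however many qualifying events are recorded.
    """
    horizon = timedelta(days=WINDOWS_DAYS[months])
    numerator = denominator = 0
    for p in patients:
        window_end = p["index_date"] + horizon
        # Assumption: the 24-month analysis drops patients without a full window.
        if months == 24 and window_end > ADMIN_CENSOR:
            continue
        denominator += 1
        cutoff = min(window_end, ADMIN_CENSOR)
        if any(p["index_date"] < e <= cutoff for e in p["event_dates"]):
            numerator += 1
    return numerator, denominator
```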
Comparisons between uncontrolled and controlled hypertensive patients for characteristics, treatment, and outcomes were performed using appropriate tests, including Pearson's chi-square test for normally distributed continuous variables, the Wilcoxon signed rank test for non-normally distributed continuous variables, the McNemar test for 2 × 2 categorical variables, and the generalized Mantel-Haenszel test for 2 × n categorical variables (where n > 2). All analyses were conducted individually at each site using a decentralized model [22]; summary results were shared across researchers from the two sites, with no patient-level data shared. All statistical analyses were performed using SAS software version 9.4 (SAS Institute, Cary, NC, USA) and R statistical software version 3.6.

Results
In total, our study included 514,687 adult patients from YNHHS and 1,075,204 adult patients from OneFlorida who had at least one ambulatory visit with recorded BP data between October 1, 2015, and December 31, 2018, as depicted in Fig. 2. At OneFlorida, 224,534 patients were diagnosed with hypertension based on diagnosis codes, and 135,230 patients were identified through elevated BP measurements. Similarly, at YNHHS, hypertension was identified in 346,994 patients based on diagnosis codes and in 167,693 patients through elevated BP measurements. We attributed overlapping patients to the group based on the criterion that was met first. After excluding additional patients with less than three months of follow-up, female patients with pregnancy-related diagnostic evidence, and patients receiving dialysis, 253,207 patients with hypertension from YNHHS and 182,827 from OneFlorida were included in the final analysis (Supplemental Table S4). At YNHHS, the mean age of patients was 65.0 years (SD = 14.6) and 47.8% of patients were men; 12.6% of patients were Black, 76.2% were White, and 9.0% were Hispanic. At OneFlorida, the mean age of patients was 61.0 years (SD = 14.7) and 44.8% of patients were men; 25.2% of patients were Black, 47.7% were White, and 15.4% were Hispanic.

Prevalence and characteristics of uncontrolled hypertension
In our primary analysis using approach 1, uncontrolled hypertension was prevalent, affecting 32.5% of patients at YNHHS and 43.7% at OneFlorida. Patients with uncontrolled hypertension were typically younger and were more likely to be male and Black. Additionally, a higher proportion of these patients preferred speaking Spanish or other non-English languages. These patients also had higher rates of obesity and smoking compared to those with controlled hypertension (P < 0.01; see Table 1 for detailed statistics). Despite these risk factors, patients with uncontrolled hypertension presented with fewer comorbidities overall.

Medication prescription patterns
At YNHHS, 62.1% of patients with hypertension, including 60.7% of those with uncontrolled hypertension and 62.7% of those with controlled hypertension (p = 0.56), were not prescribed any antihypertensive drugs in the year prior to the index date (Table 2). Among all patients with hypertension, ACEIs or ARBs were prescribed in 19.8% of the patients in the year prior to the index date, followed by beta-blockers (15.3%) and CCBs (11.6%). At OneFlorida, 62.0% of patients with hypertension, including 57.8% of those with uncontrolled hypertension and 65.2% of those with controlled hypertension (p < 0.001), were not prescribed any antihypertensive drugs in the year prior to the index date. Among all patients with hypertension, ACEIs or ARBs were prescribed in 22.7% of the patients in the year prior to the index date, followed by CCBs (12.9%) and thiazide or thiazide-like diuretics (12.3%). Only 5.3% of all patients with hypertension at both YNHHS and OneFlorida sites were prescribed single-pill combination antihypertensive drugs. Similarly, over 50% of patients with hypertension were not prescribed any antihypertensive drugs on the index date. This was consistent across age, sex, and controlled/uncontrolled hypertension subgroups at both YNHHS and OneFlorida sites (Table 3). Among patients prescribed at least one antihypertensive drug, 40-50% of patients at YNHHS and 50-60% of patients at OneFlorida were prescribed one drug class, 20-30% at YNHHS and OneFlorida were prescribed drugs from two drug classes, and 10-20% at YNHHS and OneFlorida were prescribed three or more drug classes.
Among adults prescribed one antihypertensive medication class on the index date, ACEIs or ARBs were the most prescribed class at both YNHHS and OneFlorida (34.3% at YNHHS and 40.5% at OneFlorida; Table 4). For YNHHS, the second most prescribed medication class was beta-blockers (28.4%), followed by CCBs (18.9%). For OneFlorida, the second most prescribed medication class was CCBs (19.8%), followed by beta-blockers (18.8%). Among adults prescribed two antihypertensive drug classes, the combination of an ACEI or ARB and a thiazide diuretic was most common (25.8% at YNHHS and 33.1% at OneFlorida). Among patients using three or more antihypertensive drug classes, the combination of an ACEI or ARB, a CCB, and a thiazide diuretic was most common (16.9% at YNHHS and 20.9% at OneFlorida).
Using approaches 1 and 2, we identified substantially overlapping populations. These methods resulted in an 86.4% overlap of the population at YNHHS and an 86.8% overlap at OneFlorida (Supplemental Table S5). The results of the sensitivity analysis using approach 2, in which we defined controlled and uncontrolled hypertension based on a single BP measurement at the index visit, are reported in Supplemental Tables S6-S10. The results for patient characteristics, medication prescription patterns, and the incidence rate of cardiovascular events for patients with controlled and uncontrolled hypertension were consistent across both approaches. Additionally, the results of the sensitivity analysis using a BP threshold of 130/80 mmHg for hypertension control are reported in Supplemental Tables S11-S16. As expected, using a lower BP threshold identified more patients with uncontrolled hypertension. However, the patterns of patient characteristics, medication prescriptions, and the incidence rates of cardiovascular events were generally consistent with those observed in the main analysis.
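The source does not state how the overlap between approaches was computed. One plausible reading, sketched here in Python purely as an assumption, is the percentage of patients who receive the same label under both approaches; the function name and data layout are hypothetical.

```python
def percent_agreement(labels_a, labels_b):
    """Percentage of shared patients given the same label by both approaches.

    labels_a, labels_b: dicts mapping patient id -> "controlled" or
    "uncontrolled". Only patients present in both dicts are compared,
    and at least one shared patient is required.
    """
    shared = labels_a.keys() & labels_b.keys()
    same = sum(labels_a[p] == labels_b[p] for p in shared)
    return 100.0 * same / len(shared)
```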

Discussion
Our study applied two computable definitions to EHR data from two large clinical research networks, YNHHS and OneFlorida, to identify and characterize patient populations with controlled and uncontrolled hypertension. The two computable definitions generated consistent results. Approximately 30-40% of hypertensive patients receiving ambulatory care within both health system networks had uncontrolled hypertension, of whom 60% were untreated. We were also able to characterize short-term and long-term outcomes among patients with both controlled and uncontrolled hypertension. These findings lay a foundation for more sophisticated analyses to assess the quality of care and outcomes for patients with hypertension in future studies.
A strength of this study was the successful use of a decentralized model for clinical research. Both YNHHS and OneFlorida retained their data behind their individual firewalls, but data were managed using common definitions and data models that enabled harmonized research using federated analytics. Conducting clinical research using federated models enables aggregation of observations across multiple health systems, thereby examining a much larger and more diverse patient population than when using data from a single health system. The consistent overall results that we found across both YNHHS and OneFlorida suggest that a reusable infrastructure can be created for digital population health surveillance and identification of people with hypertension who would benefit from more aggressive management.
We encountered several challenges during the study, along with insights suggesting that all of them are addressable. The first challenge was accurately defining and identifying a condition-specific population, in this case patients with uncontrolled hypertension. To use EHR data to perform high-quality clinical research, construction of accurate patient cohorts is vital. This is particularly important for uncontrolled hypertension, for which there is no specific diagnostic code and identification usually requires many observations over time.

Table 3 Number of antihypertensive medication classes prescribed on the index date among patients with hypertension, according to age and sex
Importantly, previous studies have shown that diagnosis codes used in isolation generally do not have sufficient accuracy for cohort identification. Even for a straightforward diagnosis such as hypertension, approximately 30% of the people identified with hypertension by BP measurements recorded in the EHR were missing the associated diagnostic code [23,24]. We found that a similar proportion of hypertensive patients did not have an associated diagnostic code. One approach to enhance the robustness of results is to use different operational definitions of uncontrolled hypertension and assess the consistency of the outcomes. In our study, we observed consistent findings for patient characteristics, medication prescription patterns, and cardiovascular event incidence rates across both approaches. This consistency provides greater confidence in the robustness of our findings. With the increasing emphasis on ambulatory and home BP monitoring [25,26], additional data sources may be available to better understand the management of hypertension when these data are integrated with the EHR.
Second, using health system data to classify antihypertensive medications and examine prescribing patterns has challenges. Many medications have multiple indications and dosage forms, and existing therapeutic classification systems generally group medications in ways that only partially correlate with intended use. For example, timolol is a beta-blocker that has both oral and ophthalmic dosage forms. The oral form is used to treat hypertension, whereas the ophthalmic form is used to treat glaucoma [27,28]. Therefore, the mere presence of a drug entity in the prescription records may not be sufficient to accurately identify medications being used to treat hypertension. A solution is to query EHR data for antihypertensive medication prescriptions using a set of standardized drug codes and names [20]. This approach allowed us to properly identify antihypertensive medications, assign each medication to a medication class, and apply consistent definitions across multiple health systems. Of note, we found that over 50% of patients with controlled hypertension were not on antihypertensive medications. These individuals likely achieved their BP goals through non-pharmacologic means. Lifestyle modifications, such as adopting a healthy diet, engaging in regular physical activity, and reducing stress, have been shown to have a positive impact on BP management. It is also possible that these patients were effectively treating and managing underlying medical conditions that contribute to elevated BP, such as obstructive sleep apnea, chronic kidney disease, or hormonal disorders. In addition, the distribution of prescribed antihypertensive medications varied across health systems, as the specific selection of medication depends on multiple factors. For instance, diuretics may be favored for hypertensive patients experiencing fluid retention, while beta-blockers might be more suitable for those with a history of heart disease or arrhythmia. Similarly, ACE inhibitors or ARBs may be preferred for hypertensive patients with diabetes or chronic kidney disease because of their additional renal protective effects. Moreover, the choice of antihypertensive medication can be influenced by the prescribing physician's preferences and familiarity with different medication classes. Some physicians may have greater expertise with certain medications or prefer those with fewer side effects and better tolerability profiles.
Third, there were pros and cons of using primary discharge diagnosis codes versus secondary diagnosis codes to identify the outcomes of interest across health systems. Primary discharge diagnosis codes from hospitalizations for CVD events such as stroke may be less prone to misclassification than codes from ambulatory visits. However, some events may be missed by relying solely on primary diagnosis codes, particularly when there are concurrent diagnoses. On the other hand, including secondary diagnoses may capture more events, but it may also introduce noise because patients with acute strokes cannot be distinguished from those with a history of prior stroke. The approach we used in this study was to include only acute event codes, whether or not they were in the primary diagnosis position, for outcome ascertainment. Another common solution for improving the accuracy of outcome ascertainment is to validate the diagnosis codes against manual chart review, as shown in prior EHR studies [29]. While our study did not perform chart review due to the limited scope of work, comparing the diagnostic codes or algorithms with clinician review of EHRs to determine the extent of concordance between codes and clinical judgement may be necessary to evaluate and improve the validity of codes or algorithms. There is also a critical need to ensure that these methods are consistent across different sites within the distributed research model.
Of note, the present study adopts a descriptive design and does not aim to evaluate the association between hypertension control and clinical outcomes. As a result, the controlled and uncontrolled hypertension groups may differ in demographic or clinical characteristics that were not accounted for in the outcome analysis. The controlled hypertension group might have been composed of individuals who were more proactive in managing their condition and adhering to treatment regimens. This self-selection bias could indicate that these patients were generally more engaged in their health, leading to higher healthcare utilization and subsequent identification of clinical events. Another plausible explanation is that patients with more severe or complicated health conditions were prioritized for intensive treatment and achieved controlled hypertension. Therefore, the higher rates of clinical outcomes observed in this group could be attributed to their underlying medical complexity rather than the effect of blood pressure control itself. Finally, it is possible that unmeasured or unknown confounders influenced both the choice of treatment strategy and the clinical outcomes.
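The trade-off between primary-only and any-position ascertainment discussed above can be illustrated with a minimal Python sketch. It is not the study's implementation: the `Encounter` structure, the setting labels, and the short ICD-10-CM prefix list (I63 for cerebral infarction, I21 for acute myocardial infarction) are assumptions chosen for illustration, and the actual code sets used at each site may differ.

```python
from dataclasses import dataclass
from typing import List, Set

# Illustrative ICD-10-CM prefixes for acute events (assumed, not the study's list).
ACUTE_PREFIXES = {
    "stroke": ("I63",),                 # cerebral infarction
    "myocardial_infarction": ("I21",),  # acute myocardial infarction
}
# History or sequelae codes that must not count as acute events. They do not
# match the prefixes above; they are listed to make the exclusion explicit.
HISTORY_PREFIXES = ("Z86.73", "I69", "I25.2")


@dataclass
class Encounter:
    """Hypothetical encounter record; diagnoses[0] is the primary diagnosis."""
    patient_id: str
    setting: str  # assumed labels: "ED", "inpatient", "ambulatory"
    diagnoses: List[str]


def acute_events(enc: Encounter, primary_only: bool = False) -> Set[str]:
    """Return the acute event types coded on an ED or inpatient encounter."""
    if enc.setting not in {"ED", "inpatient"}:
        return set()
    codes = enc.diagnoses[:1] if primary_only else enc.diagnoses
    events = set()
    for code in codes:
        if code.startswith(HISTORY_PREFIXES):
            continue  # history of a prior event, not an acute event
        for event, prefixes in ACUTE_PREFIXES.items():
            if code.startswith(prefixes):
                events.add(event)
    return events


# Stroke coded in a secondary position during a pneumonia admission.
enc = Encounter("p1", "inpatient", ["J18.9", "I63.9"])
any_position = acute_events(enc)
primary_position = acute_events(enc, primary_only=True)
```

In this example the any-position rule captures the stroke while the primary-only rule misses it, and an encounter carrying only a history code such as Z86.73 yields no event under either rule.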

Limitations
There are several limitations in this study. First, there may be variations in the methods and devices used to measure BP across and within the two health systems. Measurement of BP in a clinical practice setting may not mirror that of a trial or be performed per best practices. Second, we only used prescribing data to evaluate antihypertensive medications and do not have information on whether the prescriptions were filled or taken by the patients. Third, we used ED or inpatient encounters in the EHR to define clinical outcomes, which presumes that patients were hospitalized at the given health system of interest. For acute events such as myocardial infarction and stroke, patients are often taken by ambulance to the nearest hospital, which may not always be within the YNHHS or OneFlorida network. Thus, there may be incomplete ascertainment of acute events in EHRs. Fourth, we assessed the antihypertensive medications prescribed to patients with hypertension during the year preceding and including the index date. However, some patients might have been prescribed medications after the index date, and we did not assess changes in medication prescription over time. Our approach potentially leads to an underestimation of the overall treatment rate in patients with hypertension. Patients who were not prescribed medications may have been following lifestyle modifications or may not have been receiving any antihypertensive therapy because of diagnosed conditions. However, since lifestyle modifications are often documented in unstructured clinical notes in the EHR, we were unable to access that information. Finally, since the analysis of cardiovascular outcomes is exploratory and participants may lack sufficient follow-up data, we performed only simple descriptive analyses and did not develop models to estimate hazard ratios relating uncontrolled BP to clinical outcomes.
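The one-year lookback described in the fourth limitation, combined with the dose-form issue raised in the discussion of timolol, can be sketched in Python as follows. This is a hedged illustration only: the study used standardized drug codes [20], whereas the small `(ingredient, route)` lookup table, the 365-day window, and the function name here are hypothetical stand-ins.

```python
from datetime import date, timedelta

# Hypothetical lookup keyed on (ingredient, route). Only forms intended for
# hypertension treatment are listed, so ophthalmic timolol is absent.
ANTIHYPERTENSIVE_CLASS = {
    ("timolol", "oral"): "beta-blocker",
    ("lisinopril", "oral"): "ACEI",
    ("losartan", "oral"): "ARB",
    ("amlodipine", "oral"): "CCB",
}


def classes_in_lookback(prescriptions, index_date, lookback_days=365):
    """Return medication classes prescribed in the window ending on index_date.

    prescriptions: iterable of (ingredient, route, order_date) tuples.
    The window includes the index date and excludes any later orders.
    """
    start = index_date - timedelta(days=lookback_days)
    classes = set()
    for ingredient, route, order_date in prescriptions:
        if not (start <= order_date <= index_date):
            continue  # outside the lookback window
        drug_class = ANTIHYPERTENSIVE_CLASS.get((ingredient.lower(), route.lower()))
        if drug_class is not None:
            classes.add(drug_class)
    return classes


index = date(2018, 6, 1)
rx = [
    ("Timolol", "ophthalmic", date(2018, 3, 1)),  # glaucoma use, not counted
    ("Lisinopril", "oral", date(2018, 6, 1)),     # on the index date, counted
    ("Amlodipine", "oral", date(2018, 7, 1)),     # after the index date, missed
    ("Losartan", "oral", date(2017, 5, 1)),       # before the window, missed
]
found = classes_in_lookback(rx, index)
```

The amlodipine order after the index date shows how a fixed window can undercount treatment, which is the underestimation noted above.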

Conclusions
This study underscores the promising role of real-world health system data, gathered during routine clinical care, for use in clinical research. Our findings illustrate the potential of leveraging EHR data, employing computable definitions, to conduct effective digital population surveillance in the realm of hypertension management. This approach shows promise in identifying patients with uncontrolled hypertension who might benefit from additional medical interventions. Furthermore, our research brings to light the inherent challenges associated with utilizing health system data for research purposes, and outlines strategies to navigate these challenges effectively. These insights contribute significantly to the evolving field of real-world data application, offering a foundation for generating high-quality evidence that can inform decisions by regulators, clinicians, and patients. While our study indicates the feasibility and utility of these computational definitions in EHR data, future validation studies are needed to confirm their accuracy and reliability comprehensively.

Fig. 2
Diagram for study population selection

Table 1
Baseline characteristics of patients with hypertension at the index encounter

Table 2
Antihypertensive medication classes prescribed for patients with hypertension in the year prior to the index date

Table 4
Top three commonly prescribed antihypertensive medication classes on the index date among treated patients with hypertension
ACEI Angiotensin-converting enzyme inhibitor, ARB Angiotensin receptor blocker, CCB Calcium channel blocker

Table 5
Rates of death, non-fatal CVD events, and healthcare utilization among patients with uncontrolled and controlled hypertension at two health systems at 3, 6, 12, and 24 months after the index date
a All clinical outcomes include the composite of death and non-fatal CVD events